Title Manifold Learning on Probabilistic Graphical Models

نویسندگان

  • Yuanlong Shao
  • Hujun Bao
  • Xiaofei He
چکیده

Modeling training data is a fundamental problem in machine learning. In this thesis, we put together the two most powerful data modeling techniques, namely manifold learning and statistical modeling, so that the combined method will benefit from the advantages of both approaches. Based on our relevant previous works, this thesis proposed a theoretical framework in terms of backgrounds on Riemannian manifold. We designed two different algorithms to do the optimization and inference, each with different performance guarantees. Beside of that, we also give in depth analysis of the algorithms, including convergence property, convexity, and computational complexity, which greatly enlarged the extent of our approach and broadened its application range. To accelerate the whole process, we also created a general purpose statistical inference engine, namedYASIE (Yet Another Statistical Inference Engine), so that different models can be composed by using building blocks, and the computational cost are carefully refined so that it approaches hand written inference codes. Based on these methodologies and tools, we show our experimental results in two classic machine learning problems. Our approach is particularly good for semi-supervised learningwhere most of the training data instances are not labeled. Related papers are published in top conferences and journals such as ACMMultimedia and IEEE TKDE. Manifold learning assumes that the intrinsic dimensionality is much smaller than the apparent dimensionality of the data. Possible data samples will lie on a low dimensional manifold embedded in the high dimensional ambient space. The task of manifold learning is to get some knowledge of the true structure of the manifold by using the finite number of data samples at hand, so that we can compute and approach the true geometrical properties of the manifold, such as low dimensional embedding, tangent space, Laplace-Beltrami operators, etc. Current manifold learningmethods usually creates adjacency or similarity graphs among the data points, from which an objective III úôŒÆa¬Æ Ø© Abstract function for the optimization of the labels of the data points is induced. The benefit of manifold learning is that it is highly non-parametric, it models non-trivial structures among data effectively and accurately, and sometimes the discretely computed results can even be proved to converge to the continuous case. The problem is that it is difficult to deal with multi-modal data, such as those in image annotation. Besides, it is difficult to introduce prior knowledge and deal with dynamic data in an online setting using manifold learning. Contrarily, statistical methods models data with properly factorized joint distribution. Thanks to the long term accumulation, statistical methods have good solutions for the above mentioned problems. However, the problem of statistical modeling is that it is usually highly parametric. Whether these methods fits data well depends on how well we specify the parametric models. Most computationally tractable and effective models have difficulties dealing with data in non-trivial manifolds. In this thesis, we combine the two approaches in two different manner. One is to append the objective function of statistical learning with a new constraint induced by the adjancy graph for approaching the manifold structure. The added constraint is regarded as a regularization term. Most matured methodologies in this thesis are based on this principle. Another possible direction is to make better use of the probabilistic graphical model, to model the adjacency graph directly, so that the probabilistic dependency relationship can be expressed straightforwardly in the chain graph model. And we can prove that: (i) Some manifold regularization terms can be reinterpreted by a specific form of chain graph representation; (ii) Some chain graph representation of the adjacency graph can be reinterpreted by using manifold regularization. This part of the work is still in exploration.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Rule-based joint fuzzy and probabilistic networks

One of the important challenges in Graphical models is the problem of dealing with the uncertainties in the problem. Among graphical networks, fuzzy cognitive map is only capable of modeling fuzzy uncertainty and the Bayesian network is only capable of modeling probabilistic uncertainty. In many real issues, we are faced with both fuzzy and probabilistic uncertainties. In these cases, the propo...

متن کامل

Visual Tracking by Sampling Tree-Structured Graphical Models

Probabilistic tracking algorithms typically rely on graphical models based on the first-order Markov assumption. Although such linear structure models are simple and reasonable, it is not appropriate for persistent tracking since temporal failures by short-term occlusion, shot changes, and appearance changes may impair the remaining frames significantly. More general graphical models may be use...

متن کامل

Applying ISOMAP to the Learning of Hyperspectral Image

In this paper, we present the application of a non-linear dimensionality reduction technique for the learning and probabilistic classification of hyperspectral image. Hyperspectral image spectroscopy is an emerging technique for geological investigations from airborne or orbital sensors. It gives much greater information content per pixel on the image than a normal colour image. This should gre...

متن کامل

مدل ترکیبی تحلیل مؤلفه اصلی احتمالاتی بانظارت در چارچوب کاهش بعد بدون اتلاف برای شناسایی چهره

In this paper, we first proposed the supervised version of probabilistic principal component analysis mixture model. Then, we consider a learning predictive model with projection penalties, as an approach for dimensionality reduction without loss of information for face recognition. In the proposed method, first a local linear underlying manifold of data samples is obtained using the supervised...

متن کامل

Thesis Proposal Parallel Learning and Inference in Probabilistic Graphical Models

Probabilistic graphical models are one of the most influential and widely used techniques in machine learning. Powered by exponential gains in processor technology, graphical models have been successfully applied to a wide range of increasingly large and complex real-world problems. However, recent developments in computer architecture, large-scale computing, and data-storage have shifted the f...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010